Check structural relationships

mediations

multilevel mediation

cross-lagged model

Test structural (equation) models

Author

Joachim Waterschoot

Published

December 5, 2023

Mediation

What is a mediation?

With mediation analyses, we want to check to what extent an predictor-outcome association is explained by their associations with a third variable, the ‘mediator’.

Path C refers to the association between the predictor and the outcome. This is the total effect.

flowchart LR
    A(Predictor) -->|path C| B(Outcome)

In a mediation, we want to check whether the mediator explains this total effect. Specifically, we aim to calculate the indirect effect (i.e., effect through the mediator between predictor and outcome) and the direct effect (i.e., what remains of the total effect after controlling for the indirect effect). The indirect effect is the multiplication of path A and path B. The direct effect is path C*.

flowchart LR
    D(Predictor) -->|path A| E(Mediator)
    D(Predictor) -->|path C*| F(Outcome)
    E(Mediator) -->|path B| F(Outcome)

The lavaan package allows us to extend this mediation analyses for multiple predictors, mediators and outcomes. In the current example, we want to check to what extent our condition effect (a dummy code for the conditions no choice versus choice) on the outcomes pleasure and interest, vitality, and intended persistence is explained by the mediators autonomy satisfaction and competence satisfaction.

flowchart LR
    A(Condition) --> |B1| B(Autonomy satisfaction)
    A(Condition) --> |B2| C(Competence satisfaction)

    A(Condition) --> |C1| D(Pleasure and interest)
    A(Condition) --> |C2| E(Vitality)
    A(Condition) --> |C3| F(Intended persistence)

    B(Autonomy satisfaction) --> |A1| D(Pleasure and interest)
    B(Autonomy satisfaction) --> |A3| E(Vitality)
    B(Autonomy satisfaction) --> |A5| F(Intended persistence)

    C(Competence satisfaction) --> |A2| D(Pleasure and interest)
    C(Competence satisfaction) --> |A4| E(Vitality)
    C(Competence satisfaction) --> |A6| F(Intended persistence)

In describing the model, we need to provide a unique label to each pathway, so we can use this in the calculation of the indirect and total effects.

library(lavaan)

SEM <- "
# predictor -> mediators (paths B)
Autonomy ~ B1*Condition.d
Competence ~ B2*Condition.d

# predictor + mediator --> outcome (paths A and C*)
Pleasure ~ A1*Autonomy + A2*Competence + C1*Condition.d
Vitality ~ A3*Autonomy + A4*Competence + C2*Condition.d
Intended_persistence ~ A5*Autonomy + A6*Competence + C3*Condition.d

# calculation of an indirect effect
B1A1 := B1*A1
B1A2 := B1*A3
B1A3 := B1*A5

B2A2 := B2*A2
B2A4 := B2*A4
B2A6 := B2*A6

# calculating total effect
totB1A1 := C1 + (B1*A1)
totB1A2 := C1 + (B1*A3)
totB1A3 := C1 + (B1*A5)

totB2A2 := C1 + (B2*A2)
totB2A4 := C1 + (B2*A4)
totB2A6 := C1 + (B2*A6)
"

fit <- sem(model = SEM, data = data)

summary(fit,
        fit.measures = TRUE,
        standardize = TRUE,
        rsquare = TRUE)

# useful when fit of model is not good:
modificationIndices(fit, sort.=TRUE, minimum.value=3)

Check the output

summary(fit,
        fit.measures = TRUE,
        standardize = TRUE,
        rsquare = TRUE)

lavaan 0.6.17 ended normally after 13 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        19

                                                  Used       Total
  Number of observations                            99         104

Model Test User Model:
                                                      
  Test statistic                                99.060
  Degrees of freedom                                 1
  P-value (Chi-square)                           0.000

Model Test Baseline Model:

  Test statistic                               504.532
  Degrees of freedom                                15
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.800
  Tucker-Lewis Index (TLI)                      -2.005

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)               -581.632
  Loglikelihood unrestricted model (H1)       -532.102
                                                      
  Akaike (AIC)                                1201.264
  Bayesian (BIC)                              1250.572
  Sample-size adjusted Bayesian (SABIC)       1190.569

Root Mean Square Error of Approximation:

  RMSEA                                          0.995
  90 Percent confidence interval - lower         0.835
  90 Percent confidence interval - upper         1.166
  P-value H_0: RMSEA <= 0.050                    0.000
  P-value H_0: RMSEA >= 0.080                    1.000

Standardized Root Mean Square Residual:

  SRMR                                           0.178

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                         Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Autonomy ~                                                                  
    Conditn.d (B1)          1.193    0.160    7.432    0.000    1.193    0.598
  Competence ~                                                                
    Conditn.d (B2)          0.839    0.183    4.584    0.000    0.839    0.418
  Pleasure ~                                                                  
    Autonomy  (A1)          0.022    0.090    0.245    0.806    0.022    0.015
    Competenc (A2)          0.897    0.079   11.409    0.000    0.897    0.615
    Conditn.d (C1)          1.186    0.190    6.228    0.000    1.186    0.405
  Vitality ~                                                                  
    Autonomy  (A3)          0.110    0.095    1.154    0.248    0.110    0.082
    Competenc (A4)          0.871    0.083   10.435    0.000    0.871    0.651
    Conditn.d (C2)          0.673    0.202    3.329    0.001    0.673    0.251
  Intended_persistence ~                                                      
    Autonomy  (A5)          0.158    0.110    1.433    0.152    0.158    0.142
    Competenc (A6)          0.438    0.096    4.539    0.000    0.438    0.396
    Conditn.d (C3)          0.518    0.234    2.218    0.027    0.518    0.234

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .Pleasure ~~                                                           
   .Vitality          0.185    0.057    3.248    0.001    0.185    0.345
   .Intndd_prsstnc    0.210    0.066    3.197    0.001    0.210    0.339
 .Vitality ~~                                                           
   .Intndd_prsstnc    0.155    0.068    2.290    0.022    0.155    0.237

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Autonomy          0.634    0.090    7.036    0.000    0.634    0.642
   .Competence        0.825    0.117    7.036    0.000    0.825    0.825
   .Pleasure          0.505    0.072    7.036    0.000    0.505    0.237
   .Vitality          0.569    0.081    7.036    0.000    0.569    0.318
   .Intndd_prsstnc    0.759    0.108    7.036    0.000    0.759    0.623

R-Square:
                   Estimate
    Autonomy          0.358
    Competence        0.175
    Pleasure          0.763
    Vitality          0.682
    Intndd_prsstnc    0.377

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    B1A1              0.026    0.107    0.245    0.806    0.026    0.009
    B1A2              0.131    0.115    1.141    0.254    0.131    0.049
    B1A3              0.188    0.134    1.407    0.160    0.188    0.085
    B2A2              0.752    0.177    4.254    0.000    0.752    0.257
    B2A4              0.731    0.174    4.197    0.000    0.731    0.273
    B2A6              0.367    0.114    3.226    0.001    0.367    0.166
    totB1A1           1.212    0.158    7.691    0.000    1.212    0.414
    totB1A2           1.317    0.203    6.498    0.000    1.317    0.454
    totB1A3           1.374    0.211    6.507    0.000    1.374    0.490
    totB2A2           1.939    0.243    7.991    0.000    1.939    0.663
    totB2A4           1.917    0.252    7.613    0.000    1.917    0.678
    totB2A6           1.553    0.214    7.274    0.000    1.553    0.571

fitMeasures(fit, c("chisq", "df", "cfi", "rmsea", "srmr"))

 chisq     df    cfi  rmsea   srmr 
99.060  1.000  0.800  0.995  0.178

Note

The output shows that:

The model has an acceptable fit
Pathways between predictor and mediator are significant. Also, mediators are significantly related to outcomes.
All total effects were significant, with direct effects still being significant and 3 indirect effects being significant (through the mediator Competence). Here, we can talk about a partial mediation.

Tip

A useful online tool to visualize structural equation model is https://app.diagrams.net/

Multilevel Mediation

We can also check for such a mediation model in a multilevel way. Currently, this is possible up until two levels. Here, we check for a simple example:

ML_SEM <- '
level:1
OUTCOME1 ~ VAR1 + VAR2 
OUTCOME2 ~ VAR1 + VAR2 

level:2
OUTCOME1 ~ VAR1 + VAR2 
OUTCOME2 ~ VAR1 + VAR2 
'

fit <- sem(model = ML_SEM, 
           data = dflong, 
           cluster = "ID", # group variable including dependent variance
           optim.method = "em")

summary(fit,
        fit.measures = TRUE,
        standardize = TRUE,
        rsquare = TRUE)

modificationIndices(fit, sort.=TRUE, minimum.value=3)

Cross-lagged panel model

Between-subject

flowchart LR
    A(Predictor X - time 1) -->|autoregression| B(Predictor X time - 2)
    A(Predictor X - time 1) -->|cross-lagged| C(Predictor Y - time 2)
    D(Predictor Y - time 1) -->B(Predictor X - time 2)
    D(Predictor Y - time 1) -->C(Predictor Y - time 2)

model <- 
'
VAR1_T2 + VAR2_T2 ~ VAR1_T1 + VAR2_T1
'

fit <- sem(model, data = df)
summary(fit, 
        fit.measures = TRUE, 
        standardized = TRUE, 
        rsquare = TRUE)
modificationIndices(fit, sort.=TRUE, minimum.value=3)

Within-subject

Flournoy wrote an amazingly useful package to generate a syntax for a RI-CLPM in the riclpmr package.

library(devtools)
# install_github('jflournoy/riclpmr') # in case you have not installed the package yet
library(riclpmr)
library(lavaan)

Select the variables from your wide dataset.

data_riclpm <- dfwide[,c("ID", # also important to include the grouping variable
                         'VARIABLEX_1','VARIABLEX_2','VARIABLEX_3','VARIABLEX_4',
                         'VARIABLEY_1','VARIABLEY_2','VARIABLEY_3','VARIABLEY_4')]

# give different column names to make the output of the model more readable
colnames(data_riclpm) <- c("id",
                           'x1','x2','x3','x4',
                           'y1','y2','y3','y4')

data_riclpm <- data_riclpm[ , -c(1)] #remove ID
 
# refer which columns belong to a specific variable
var_groups <- list(
  x=c('x1','x2','x3','x4'),
  y=c('y1','y2','y3','y4')
)

Just run the following code. Herein, a constrained and an unconstrained model are performed and compared (via ANOVA). Based on which model provides the best fit of the data, you can check the output of the model.

# construct contraint model
model_text <- riclpmr::riclpm_text(var_groups,
                                   constrain_over_waves = TRUE,
                                   constrain_ints = "free")

fit_constraints <- riclpmr::lavriclpm(riclpmModel = model_text, 
                                      data = data_riclpm,
                                      missing = 'fiml', 
                                      meanstructure = T, 
                                      int.ov.free = T)
# construct unconstraint model
model_text <- riclpmr::riclpm_text(var_groups,
                                   constrain_over_waves = FALSE,
                                   constrain_ints = "free")

fit_noconstraints <- riclpmr::lavriclpm(riclpmModel = model_text, 
                                        data = data_riclpm,
                                        missing = 'fiml', 
                                        meanstructure = T, 
                                        int.ov.free = T)

# run this to compare constraint and unconstraint model in terms of data fit
anova(fit_constraints,fit_noconstraints)

# Check the output of the chosenmodel
summary(fit_constraints,
        fit.measures = TRUE,
        standardize = TRUE,
        rsquare = TRUE)